The reader

The reader is a procedure available in the standard environment as the value of the variable READ-OBJECT. Conceptually, the reader coerces a stream of characters (external representations) to a stream of objects (internal representations) via a mechanism known as parsing.


\begin{inset}{}
{\tt READ-OBJECT} employs the {\tt READ-CHAR} (page \pageref{REA...
...obtained
by calling the {\tt PORT-READ-TABLE} operation on the port.
\end{inset}

The reader works as follows:

Any whitespace characters (space, tab, newline, carriage return, line feed, or form feed) are read and ignored. A non-whitespace character is obtained; call it c.

If c is a read-macro character, the reader invokes a specialist routine to handle a syntactic construct introduced by the read-macro character.

If c is not a read-macro character, then characters are read and saved until a delimiter character is read. A delimiter character is either a whitespace character, or one of the following: ( (left parenthesis), ) (right parenthesis), [, ], {, }, or ; (semicolon). If the sequence of characters beginning with c and going up to but not including the delimiter is parsable as a number, then the sequence is converted to a number, which is returned. Otherwise the sequence is converted to a symbol.
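The non-read-macro path above can be sketched in Python as follows. This is a minimal illustration, not the actual implementation: the names `read_token` and `parse_atom` are hypothetical, symbols are modelled as tagged tuples, and the number syntax (optionally signed decimal integers and simple decimals) is an assumption, since the text does not define what "parsable as a number" covers. Conversion of symbol names to upper case is inferred from the discussion of the escape character.

```python
import io
import re

WHITESPACE = set(" \t\n\r\f")
DELIMITERS = WHITESPACE | set("()[]{};")

# Assumed number syntax; the real reader may accept more forms.
INTEGER = re.compile(r"[+-]?[0-9]+")
DECIMAL = re.compile(r"[+-]?([0-9]+\.[0-9]*|\.[0-9]+)")


def read_token(stream):
    """Skip whitespace, then collect characters up to a delimiter.

    Assumes the first non-whitespace character is not a read-macro
    character. Returns None at end of input. The delimiter itself is
    left unread.
    """
    c = stream.read(1)
    while c and c in WHITESPACE:
        c = stream.read(1)
    if not c:
        return None
    chars = [c]
    while True:
        pos = stream.tell()
        c = stream.read(1)
        if not c:
            break
        if c in DELIMITERS:
            stream.seek(pos)  # push the delimiter back
            break
        chars.append(c)
    return "".join(chars)


def parse_atom(token):
    """Convert a token to a number if possible, otherwise to a symbol."""
    if INTEGER.fullmatch(token):
        return int(token)
    if DECIMAL.fullmatch(token):
        return float(token)
    return ("symbol", token.upper())


stream = io.StringIO("  foo 42)")
first = parse_atom(read_token(stream))   # ("symbol", "FOO")
second = parse_atom(read_token(stream))  # 42
```

After the second call the right parenthesis is still in the stream, so a caller reading a list would see it next.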

The escape character, backslash ( \), may be used within a run of constituent characters to include unusual characters in a symbol's print name. In this case, the escaped character (i.e. the character following the escape character) is treated as if it were a constituent character, and is not converted to upper case if it is a lower case letter. For example:
\begin{codexenv}
abc{$\backslash$};def {\rm reads the same as} \char93 [Symbol '...
...lash$}'12345 {\rm reads the same as} \char93 [Symbol ''12345''']
\end{codexenv}
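The escape rule can be sketched as a small variation on token collection. This is an illustrative Python sketch under stated assumptions: `read_symbol_name` is a hypothetical name, unescaped characters are folded to upper case as the rule above implies, and the returned flag is one possible way for a caller to know that the token must not be parsed as a number.

```python
import io

WHITESPACE = set(" \t\n\r\f")
DELIMITERS = WHITESPACE | set("()[]{};")
ESCAPE = "\\"


def read_symbol_name(stream):
    """Return (name, saw_escape) for the run of constituents at the stream head.

    An escaped character is kept exactly as written, even if it is a
    delimiter or a lower case letter. Other characters are upper-cased.
    """
    name = []
    saw_escape = False
    while True:
        pos = stream.tell()
        c = stream.read(1)
        if not c:
            break
        if c == ESCAPE:
            escaped = stream.read(1)
            if not escaped:
                raise EOFError("input ended after escape character")
            name.append(escaped)  # literal, case preserved
            saw_escape = True
        elif c in DELIMITERS:
            stream.seek(pos)  # leave the delimiter unread
            break
        else:
            name.append(c.upper())
    return "".join(name), saw_escape


name, escaped = read_symbol_name(io.StringIO(r"abc\;def ghi"))
# name is "ABC;DEF": the semicolon did not end the token or start a comment
```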

The following are standard read-macro characters:

"
Doublequote: introduces a string. Characters are read until a doublequote is found that does not immediately follow a backslash ( \), and the characters read are returned as a string. Within a string, backslash acts as an escape character, so that doublequotes and backslashes may appear in strings.

'
Quote: 'object reads the same as (QUOTE object).

(
Left parenthesis: begins a list.

)
Right parenthesis: ends a list or vector, and is illegal in other contexts.

`
Quasiquote: see section [*].

,
Comma: this is part of the backquote syntax.

@
At sign: this is part of the backquote syntax.

;
Semicolon: introduces a comment. Characters are read and discarded until a newline is encountered, at which point the parsing process starts over.

#
Sharp-sign: another dispatch to a specialist routine is performed according to the character following the #.
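The dispatch on read-macro characters can be pictured as a table from characters to handler procedures. The Python sketch below covers only doublequote, quote, parentheses, and semicolon. All names in it (`READ_MACROS`, `read_inner`, the `CLOSE` sentinel, the `Symbol` class) are hypothetical, every atom is read as a symbol for brevity, and the sentinel is just one way to let a right parenthesis end a list while remaining an error elsewhere.

```python
import io
from dataclasses import dataclass

WHITESPACE = set(" \t\n\r\f")
DELIMITERS = WHITESPACE | set("()[]{};")
CLOSE = object()  # sentinel returned when ")" is read


@dataclass(frozen=True)
class Symbol:
    name: str


def read_string(stream):
    chars = []
    while True:
        c = stream.read(1)
        if not c:
            raise EOFError("unterminated string")
        if c == '"':
            return "".join(chars)
        if c == "\\":  # escape: take the next character literally
            c = stream.read(1)
            if not c:
                raise EOFError("unterminated string")
        chars.append(c)


def read_quote(stream):
    return [Symbol("QUOTE"), read_object(stream)]


def read_list(stream):
    items = []
    while True:
        item = read_inner(stream)
        if item is CLOSE:
            return items
        items.append(item)


def read_close(stream):
    return CLOSE


def read_comment(stream):
    while stream.read(1) not in ("", "\n"):
        pass  # discard the rest of the line
    return read_inner(stream)  # parsing starts over


READ_MACROS = {
    '"': read_string,
    "'": read_quote,
    "(": read_list,
    ")": read_close,
    ";": read_comment,
}


def read_inner(stream):
    c = stream.read(1)
    while c and c in WHITESPACE:
        c = stream.read(1)
    if not c:
        raise EOFError("end of input")
    handler = READ_MACROS.get(c)
    if handler is not None:
        return handler(stream)
    chars = [c]
    while True:
        pos = stream.tell()
        c = stream.read(1)
        if not c:
            break
        if c in DELIMITERS:
            stream.seek(pos)
            break
        chars.append(c)
    return Symbol("".join(chars).upper())  # simplification: no numbers


def read_object(stream):
    obj = read_inner(stream)
    if obj is CLOSE:
        raise SyntaxError("unexpected right parenthesis")
    return obj
```

With these definitions, reading the text `'(a "b" ; note` followed by a newline and `c)` yields a two-element list whose first element is the symbol `QUOTE` and whose second is the list of `A`, the string `b`, and `C`.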

Standard sharp-sign forms:

#\
Character syntax. See section [*].

#x
Hexadecimal input. An integer following the #x is read in base 16.

#o
Octal input. An integer following the #o is read in base 8.

#b
Binary input. An integer following the #b is read in base 2.

#(... )
Vector. The elements of a vector are read between the parentheses, and the vector is returned.

#[... ]
This syntax is used for certain kinds of re-readable objects. It also provides an alternative syntax for characters and symbols. The brackets enclose a sequence of objects; the first should be a symbol that identifies the type of the resulting object, e.g. CHAR or SYMBOL. For example,
\begin{codexenv}
\char93 [Ascii 65] {\rm represents the same object as} {\char93...
... \\
\char93 [Symbol FOO] {\rm represents the same object as} FOO
\end{codexenv}
This syntax is used by the printer when necessary, for example:
\begin{codexenv}
(STRING-$>$SYMBOL ) $\Longrightarrow$\ \char93 [Symbol ] \\
(ASCII-$>$CHAR 128) $\Longrightarrow$\ \char93 [Ascii 128]
\end{codexenv}

#{... }
This is the syntax used by the printer for objects that have no reader syntax. When the reader encounters the sequence #{, it signals an error.
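The second-level dispatch on the character after the sharp-sign can be sketched in the same way. The Python sketch below handles only the three radix forms and the error case. `read_sharp` is a hypothetical name, the routine assumes the # has already been consumed, and Python's `int` is used as a stand-in for the real integer parser, so it may accept spellings the actual reader rejects.

```python
import io

WHITESPACE = set(" \t\n\r\f")
DELIMITERS = WHITESPACE | set("()[]{};")
RADIX = {"x": 16, "o": 8, "b": 2}


def read_sharp(stream):
    """Handle the character following '#', which has already been read."""
    c = stream.read(1)
    if not c:
        raise EOFError("input ended after #")
    if c == "{":
        # Printer-only syntax: such objects cannot be read back in.
        raise SyntaxError("#{ has no reader syntax")
    base = RADIX.get(c)
    if base is None:
        raise NotImplementedError("sharp-sign form not covered by this sketch")
    digits = []
    while True:
        pos = stream.tell()
        c = stream.read(1)
        if not c:
            break
        if c in DELIMITERS:
            stream.seek(pos)  # leave the delimiter unread
            break
        digits.append(c)
    return int("".join(digits), base)


hex_value = read_sharp(io.StringIO("x1F "))   # 31
octal_value = read_sharp(io.StringIO("o17"))  # 15
binary_value = read_sharp(io.StringIO("b101)"))  # 5
```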